Abstract

Self-stabilization is a versatile approach to fault-tolerance since it permits a distributed
system to recover from any transient fault that arbitrarily corrupts the contents of all memories
in the system. Byzantine tolerance is an attractive feature of distributed systems that makes it
possible to cope with arbitrary malicious behaviors.

We consider the well-known problem of constructing a breadth-first spanning tree in this
context. Combining these two properties proves difficult: we demonstrate that it is impossible
to contain the impact of Byzantine nodes in a strictly or strongly stabilizing manner. We
then adopt the weaker scheme of topology-aware strict stabilization and we present a similar
weakening of strong stabilization. We prove that the classical min + 1 protocol has optimal
Byzantine containment properties with respect to these criteria.

Keywords: Byzantine fault, Distributed protocol, Fault tolerance, Stabilization, Spanning tree
construction

1 Introduction

The advent of ubiquitous large-scale distributed systems advocates that tolerance to various kinds of
faults and hazards must be included from the very early design of such systems. Self-stabilization [H
El [H] is a versatile technique that permits forward recovery from any kind of transient faults,
while Byzantine fault-tolerance [13] is traditionally used to mask the effect of a limited number
of malicious faults. Making distributed systems tolerant to both transient and malicious faults is
appealing yet proved difficult [7, 2, 17], as impossibility results are expected in many cases.

Two main paths have been followed to study the impact of Byzantine faults in the context of
self-stabilization:

• Byzantine fault masking. In completely connected synchronous systems, one of the most 
studied problems in the context of self-stabilization with Byzantine faults is that of clock 
synchronization. In [1, 7], probabilistic self-stabilizing protocols were proposed for up to one
third of Byzantine processes, while in [5, 12] deterministic solutions tolerate up to one fourth
and one third of Byzantine processes, respectively.
• Byzantine containment. For local tasks (i.e. tasks whose correctness can be checked locally,
such as vertex coloring, link coloring, or dining philosophers), the notion of strict stabilization
was proposed [17] [HI [16]. Strict stabilization guarantees that there exists a containment
radius outside which the effect of permanent faults is masked, provided that the problem 
specification makes it possible to break the causality chain that is caused by the faults. As 
many problems are not local, it turns out that it is impossible to provide strict stabilization 
for those. 

Our Contribution. In this paper, we investigate the possibility of Byzantine containment in a
self-stabilizing setting for tasks that are global (i.e. there exists a causality chain of size r, where r
depends on n, the size of the network), and focus on a global problem, namely breadth-first spanning
tree construction. A good survey on self-stabilizing solutions to this problem can be found in [11].
In particular, one of the simplest solutions is known as the min + 1 protocol (see [13]).
The name comes from the construction of the protocol itself. Each process has two variables: a
pointer to its parent in the tree and its level in this tree. The protocol reduces to the following
rule: each process chooses as its parent the neighbor with the smallest level (the min part) and
updates its own level accordingly (the +1 part). [13] proves that this protocol is self-stabilizing. In this
paper, we propose a complete study of the Byzantine containment properties of this protocol.

First, we study the space Byzantine containment properties of this protocol. As strict
stabilization is impossible with such global tasks (see [E]), we use the weaker scheme of
topology-aware strict stabilization (see [9]). In this scheme, we weaken the containment constraint by relaxing
the notion of containment radius to that of containment area, that is, Byzantine processes may disturb
infinitely often a set of processes which depends on the topology of the system and on the location
of Byzantine processes. We show that the min + 1 protocol has optimal containment area with
respect to topology-aware strict stabilization.

Second, we study the time Byzantine containment properties of this protocol using the
concept of strong stabilization (see [15, 8]). We first show that it is impossible to find a strongly
stabilizing solution to the BFS tree construction problem. This is why we weaken the concept of strong
stabilization using the notion of containment area to obtain topology-aware strong stabilization. We
then show that the min + 1 protocol also has optimal containment area with respect to
topology-aware strong stabilization.

2 Distributed System 

A distributed system $S = (P, L)$ consists of a set $P = \{v_1, v_2, \ldots, v_n\}$ of processes and a set $L$
of bidirectional communication links (simply called links). A link is an unordered pair of distinct
processes. A distributed system $S$ can be regarded as a graph whose vertex set is $P$ and whose
link set is $L$, so we use graph terminology to describe a distributed system $S$. We use the following
notations: $n = |P|$ and $m = |L|$.

Processes $u$ and $v$ are called neighbors if $(u, v) \in L$. The set of neighbors of a process $v$ is
denoted by $N_v$, and its cardinality (the degree of $v$) is denoted by $\Delta_v$ ($= |N_v|$). The degree $\Delta$ of a
distributed system $S = (P, L)$ is defined as $\Delta = \max\{\Delta_v \mid v \in P\}$. We do not assume the existence of
a unique identifier for each process. Instead we assume each process can distinguish its neighbors
from each other by locally arranging them in some arbitrary order: the $k$-th neighbor of a process
$v$ is denoted by $N_v(k)$ ($1 \le k \le \Delta_v$).



In this paper, we consider distributed systems of arbitrary topology. We assume that a single 
process is distinguished as a root, and all the other processes are identical. 

We adopt the shared state model as a communication model in this paper, where each process 
can directly read the states of its neighbors. 

The variables maintained by a process define its state. A process may take
actions during the execution of the system. An action is simply a function that is executed in an
atomic manner by the process. The actions executed by each process are described by a finite set
of guarded actions of the form $\langle guard \rangle \longrightarrow \langle statement \rangle$. Each guard of a process $u$ is a boolean
expression involving the variables of $u$ and its neighbors.

A global state of a distributed system is called a configuration and is specified by a product
of states of all processes. We define $C$ to be the set of all possible configurations of a distributed
system $S$. For a process set $R \subseteq P$ and two configurations $\rho$ and $\rho'$, we denote $\rho \stackrel{R}{\mapsto} \rho'$ when $\rho$
changes to $\rho'$ by executing an action of each process in $R$ simultaneously. Notice that $\rho$ and $\rho'$
can be different only in the states of processes in $R$. For completeness of execution semantics, we
should clarify the configuration resulting from simultaneous actions of neighboring processes. The
action of a process depends only on its state at $\rho$ and the states of its neighbors at $\rho$, and the result
of the action is reflected in the state of the process at $\rho'$.

We say that a process is enabled in a configuration $\rho$ if the guard of at least one of its actions
is evaluated as true in $\rho$.

A schedule of a distributed system is an infinite sequence of process sets. Let $Q = R^1, R^2, \ldots$ be a
schedule, where $R^i \subseteq P$ holds for each $i$ ($i \ge 1$). An infinite sequence of configurations $e = \rho_0, \rho_1, \ldots$
is called an execution from an initial configuration $\rho_0$ by a schedule $Q$, if $e$ satisfies $\rho_{i-1} \stackrel{R^i}{\mapsto} \rho_i$ for
each $i$ ($i \ge 1$). Process actions are executed atomically, and we also assume that a distributed
daemon schedules the actions of processes, i.e. any subset of processes can simultaneously execute
their actions. We say that the daemon is central if it schedules the action of only one process at any
step.
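The step semantics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `step`, `copy_neighbor`, `neighbors` and `rho`, and the toy action itself, are hypothetical and do not come from the paper. The point it shows is that every activated process reads the old configuration, so two neighbors acting simultaneously do not see each other's new values.

```python
from typing import Callable, Dict, Hashable, Iterable, List

State = Dict[str, int]
Config = Dict[Hashable, State]


def step(config: Config,
         neighbors: Dict[Hashable, List[Hashable]],
         action: Callable[[Hashable, State, Dict[Hashable, State]], State],
         R: Iterable[Hashable]) -> Config:
    """Return the configuration reached when every process in R acts at once.

    Each activated process reads its own state and its neighbors' states in
    the OLD configuration; processes outside R keep their state.
    """
    new_config = {v: dict(state) for v, state in config.items()}
    for v in R:
        neighbor_states = {u: config[u] for u in neighbors[v]}
        new_config[v] = action(v, config[v], neighbor_states)
    return new_config


def copy_neighbor(v, own, neighbor_states):
    """Toy action (hypothetical): copy the value x of the single neighbor."""
    (state,) = neighbor_states.values()
    return {"x": state["x"]}


neighbors = {"a": ["b"], "b": ["a"]}
rho = {"a": {"x": 0}, "b": {"x": 1}}

# Distributed daemon: both processes act in the same step, so the values swap.
distributed = step(rho, neighbors, copy_neighbor, {"a", "b"})
# Central daemon: only one process acts in the step.
central = step(rho, neighbors, copy_neighbor, {"a"})
print(distributed, central)
```

Under a central daemon the set R is always a singleton, which is why the two calls above yield different configurations from the same starting point.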

The set of all possible executions from $\rho_0 \in C$ is denoted by $E_{\rho_0}$. The set of all possible
executions is denoted by $E$, that is, $E = \bigcup_{\rho \in C} E_{\rho}$. We consider asynchronous distributed systems
where we can make no assumption on schedules except that any schedule is fair: a process which
is infinitely often enabled in an execution cannot remain forever unactivated in this execution.

In this paper, we consider (permanent) Byzantine faults: a Byzantine process (i.e. a
Byzantine-faulty process) can behave arbitrarily, independently of its actions. If $v$ is a Byzantine
process, $v$ can repeatedly change its variables arbitrarily.

3 Self-Stabilizing Protocol Resilient to Byzantine Faults

Problems considered in this paper are so-called static problems, i.e. they require the system to
find static solutions. For example, the spanning-tree construction problem is a static problem,
while the mutual exclusion problem is not. Some static problems can be defined by a specification
predicate (shortly, specification), $spec(v)$, for each process $v$: a configuration is a desired one (with
a solution) if every process satisfies $spec(v)$. A specification $spec(v)$ is a boolean expression on
variables of $P_v$ ($\subseteq P$) where $P_v$ is the set of processes whose variables appear in $spec(v)$. The
variables appearing in the specification are called output variables (shortly, O-variables). In what
follows, we consider a static problem defined by specification $spec(v)$.



A self-stabilizing protocol ([5]) is a protocol that eventually reaches a legitimate configuration,
where $spec(v)$ holds at every process $v$, regardless of the initial configuration. Once it reaches a
legitimate configuration, every process never changes its O-variables and always satisfies $spec(v)$.
From this definition, a self-stabilizing protocol is expected to tolerate any number and any type
of transient faults since it can eventually recover from any configuration affected by the transient
faults. However, the recovery from any configuration is guaranteed only when every process
correctly executes its action from the configuration, i.e., we do not consider the existence of permanently
faulty processes.

3.1 Strict stabilization 

When (permanent) Byzantine processes exist, Byzantine processes may not satisfy $spec(v)$. In
addition, correct processes near the Byzantine processes can be influenced and may be unable to
satisfy $spec(v)$. Nesterenko and Arora [17] define a strictly stabilizing protocol as a self-stabilizing
protocol resilient to an unbounded number of Byzantine processes.

Given an integer c, a c-correct process is a process defined as follows. 

Definition 1 (c-correct process) A process is c-correct if it is correct (i.e. not Byzantine) and 
located at distance more than c from any Byzantine process. 

Definition 2 ((c, f)-containment) A configuration $\rho$ is $(c, f)$-contained for specification $spec$ if,
given at most $f$ Byzantine processes, in any execution starting from $\rho$, every $c$-correct process $v$
always satisfies $spec(v)$ and never changes its O-variables.

The parameter $c$ of Definition 2 refers to the containment radius defined in [17]. The parameter
$f$ refers explicitly to the number of Byzantine processes, while [17] dealt with an unbounded number
of Byzantine faults (that is, $f \in \{0, \ldots, n\}$).
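As an illustration of Definition 1, the set of c-correct processes can be computed with a multi-source breadth-first search from the Byzantine processes. The following is a minimal sketch, assuming the system is given as an adjacency dictionary; the function names `distances_from` and `c_correct` are hypothetical and chosen only for this example.

```python
from collections import deque


def distances_from(sources, neighbors):
    """Multi-source BFS: hop distance from each process to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in neighbors[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def c_correct(neighbors, byzantine, c):
    """Correct processes located at distance more than c from any Byzantine one."""
    dist = distances_from(byzantine, neighbors)
    return {v for v in neighbors
            if v not in byzantine and dist.get(v, float("inf")) > c}


# Line of five processes 0 - 1 - 2 - 3 - 4 where process 4 is Byzantine.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(c_correct(line, {4}, 0))  # every correct process
print(c_correct(line, {4}, 1))  # process 3 is within distance 1 of process 4
```

With c = 0 every correct process is c-correct, and increasing c shrinks the set by one ring of processes around each Byzantine process.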

Definition 3 ((c, f)-strict stabilization) A protocol is $(c, f)$-strictly stabilizing for specification
$spec$ if, given at most $f$ Byzantine processes, any execution $e = \rho_0, \rho_1, \ldots$ contains a configuration
$\rho_i$ that is $(c, f)$-contained for $spec$.

An important limitation of the model of [17] is the notion of $r$-restrictive specifications.
Intuitively, a specification is $r$-restrictive if it prevents combinations of states that belong to two
processes $u$ and $v$ that are at least $r$ hops away. An important consequence related to Byzantine
tolerance is that the containment radius of protocols solving those specifications is at least $r$. For
some problems, such as the breadth-first search (BFS) spanning tree construction we consider in
this paper, $r$ cannot be bounded by a constant. As a consequence, we can show that there exists no
$(c, 1)$-strictly stabilizing protocol for the breadth-first search (BFS) spanning tree construction for
any (finite) integer $c$.

3.2 Strong stabilization 

To circumvent the impossibility result, [15] defines a weaker notion than strict stabilization.
Here, the requirement on the containment radius is relaxed, i.e. there may exist processes outside the
containment radius that invalidate the specification predicate, due to Byzantine actions. However,
the impact of Byzantine-triggered actions is limited in time: the set of Byzantine processes may
only impact processes outside the containment radius a bounded number of times, even if Byzantine
processes execute an infinite number of actions.

In the remainder of this section, we recall the formal definition of strong stabilization adopted in
p]. From the states of c-correct processes, c-legitimate configurations and c-stable configurations 
are defined as follows. 

Definition 4 (c-legitimate configuration) A configuration $\rho$ is $c$-legitimate for $spec$ if every
$c$-correct process $v$ satisfies $spec(v)$.

Definition 5 (c-stable configuration) A configuration $\rho$ is $c$-stable if every $c$-correct process
never changes the values of its O-variables as long as Byzantine processes make no action.

Roughly speaking, the aim of self-stabilization is to guarantee that a distributed system
eventually reaches a $c$-legitimate and $c$-stable configuration. However, a self-stabilizing system can be
disturbed by Byzantine processes after reaching a $c$-legitimate and $c$-stable configuration. The
$c$-disruption represents the period where $c$-correct processes are disturbed by Byzantine processes
and is defined as follows.

Definition 6 (c-disruption) A portion of execution $e = \rho_0, \rho_1, \ldots, \rho_t$ ($t > 1$) is a $c$-disruption if
and only if the following holds:

1. $e$ is finite,

2. $e$ contains at least one action of a $c$-correct process for changing the value of an O-variable,

3. $\rho_0$ is $c$-legitimate for $spec$ and $c$-stable, and

4. $\rho_t$ is the first configuration after $\rho_0$ such that $\rho_t$ is $c$-legitimate for $spec$ and $c$-stable.

Now we can define a self-stabilizing protocol such that Byzantine processes may only impact 
processes outside the containment radius a bounded number of times, even if Byzantine processes 
execute an infinite number of actions. 

Definition 7 ((t, k, c, f)-time contained configuration) A configuration $\rho_0$ is $(t, k, c, f)$-time
contained for $spec$ if, given at most $f$ Byzantine processes, the following properties are satisfied:

1. $\rho_0$ is $c$-legitimate for $spec$ and $c$-stable,

2. every execution starting from $\rho_0$ contains a $c$-legitimate configuration for $spec$ after which the
values of all the O-variables of $c$-correct processes remain unchanged (even when Byzantine
processes make actions repeatedly and forever),

3. every execution starting from $\rho_0$ contains at most $t$ $c$-disruptions, and

4. every execution starting from $\rho_0$ contains at most $k$ actions of changing the values of
O-variables for each $c$-correct process.



Definition 8 ((t, c, f)-strongly stabilizing protocol) A protocol $A$ is $(t, c, f)$-strongly stabilizing
if and only if starting from any arbitrary configuration, every execution involving at most $f$
Byzantine processes contains a $(t, k, c, f)$-time contained configuration that is reached after at most
$l$ rounds. Parameters $l$ and $k$ are respectively the $(t, c, f)$-stabilization time and the
$(t, c, f)$-process-disruption time of $A$.

Note that a $(t, k, c, f)$-time contained configuration is a $(c, f)$-contained configuration when
$t = k = 0$, and thus, a $(t, k, c, f)$-time contained configuration is a generalization (relaxation) of
a $(c, f)$-contained configuration. Thus, a strongly stabilizing protocol is weaker than a strictly
stabilizing one (as processes outside the containment radius may take incorrect actions due to
Byzantine influence). However, a strongly stabilizing protocol is stronger than a classical
self-stabilizing one (that may never meet its specification in the presence of Byzantine processes).

The parameters $t$, $k$ and $c$ are introduced to quantify the strength of fault containment; we do
not require each process to know the values of the parameters.

4 Topology-aware Byzantine resilience 

4.1 Topology-aware strict stabilization 

In Section 3.1, we saw that there exist a number of impossibility results on strict stabilization due
to the notion of $r$-restrictive specifications. To circumvent this impossibility result, we describe
here a weaker notion than strict stabilization: topology-aware strict stabilization (denoted
by TA strict stabilization for short) introduced by [9]. Here, the requirement on the containment
radius is relaxed, i.e. the set of processes which may be disturbed by Byzantine ones is not reduced
to the union of $c$-neighborhoods of Byzantine processes but can be defined depending on the graph
topology and the location of Byzantine processes.

In the following, we give a formal definition of this new kind of Byzantine containment. From
now on, $B$ denotes the set of Byzantine processes and $S_B$ (which is a function of $B$) denotes a subset of
$V$ (intuitively, this set gathers all processes which may be disturbed by Byzantine processes).

Definition 9 ($S_B$-correct node) A node is $S_B$-correct if it is a correct node (i.e. not Byzantine)
which does not belong to $S_B$.

Definition 10 ($S_B$-legitimate configuration) A configuration $\rho$ is $S_B$-legitimate for $spec$ if
every $S_B$-correct node $v$ is legitimate for $spec$ (i.e. if $spec(v)$ holds).

Definition 11 (($S_B$, f)-topology-aware containment) A configuration $\rho_0$ is $(S_B, f)$-topology-aware
contained for specification $spec$ if, given at most $f$ Byzantine processes, in any execution
$e = \rho_0, \rho_1, \ldots$, every configuration is $S_B$-legitimate and every $S_B$-correct process never changes its
O-variables.

The parameter $S_B$ of Definition 11 refers to the containment area. Any process which belongs
to this set may be infinitely disturbed by Byzantine processes. The parameter $f$ refers explicitly
to the number of Byzantine processes.

Definition 12 (($S_B$, f)-topology-aware strict stabilization) A protocol is $(S_B, f)$-topology-aware
strictly stabilizing for specification $spec$ if, given at most $f$ Byzantine processes, any execution
$e = \rho_0, \rho_1, \ldots$ contains a configuration $\rho_i$ that is $(S_B, f)$-topology-aware contained for $spec$.

Note that, if $B$ denotes the set of Byzantine processes and $S_B = \{ v \in V \mid \min_{b \in B} (d(v, b)) \le c \}$,
then a $(S_B, f)$-topology-aware strictly stabilizing protocol is a $(c, f)$-strictly stabilizing protocol.
Hence, the concept of topology-aware strict stabilization is a generalization of strict stabilization.
However, note that a TA strictly stabilizing protocol is stronger than a classical self-stabilizing
protocol (that may never meet its specification in the presence of Byzantine processes).

The parameter $S_B$ is introduced to quantify the strength of fault containment; we do not require
each process to know the actual definition of the set. Actually, the protocol proposed in this paper
assumes no knowledge of the parameter.

4.2 Topology-aware strong stabilization 

Similarly to topology-aware strict stabilization, we can weaken the notion of strong stabilization
using the notion of containment area. Then, we obtain the following definitions:

Definition 13 ($S_B$-stable configuration) A configuration $\rho$ is $S_B$-stable if every $S_B$-correct
process never changes the values of its O-variables as long as Byzantine processes make no action.

Definition 14 ($S_B$-TA-disruption) A portion of execution $e = \rho_0, \rho_1, \ldots, \rho_t$ ($t > 1$) is a
$S_B$-TA-disruption if and only if the following holds:

1. $e$ is finite,

2. $e$ contains at least one action of a $S_B$-correct process for changing the value of an O-variable,

3. $\rho_0$ is $S_B$-legitimate for $spec$ and $S_B$-stable, and

4. $\rho_t$ is the first configuration after $\rho_0$ such that $\rho_t$ is $S_B$-legitimate for $spec$ and $S_B$-stable.

Definition 15 ((t, k, $S_B$, f)-TA time contained configuration) A configuration $\rho_0$ is
$(t, k, S_B, f)$-TA time contained for $spec$ if, given at most $f$ Byzantine processes, the following properties are
satisfied:

1. $\rho_0$ is $S_B$-legitimate for $spec$ and $S_B$-stable,

2. every execution starting from $\rho_0$ contains a $S_B$-legitimate configuration for $spec$ after which
the values of all the O-variables of $S_B$-correct processes remain unchanged (even when
Byzantine processes make actions repeatedly and forever),

3. every execution starting from $\rho_0$ contains at most $t$ $S_B$-TA-disruptions, and

4. every execution starting from $\rho_0$ contains at most $k$ actions of changing the values of
O-variables for each $S_B$-correct process.

Definition 16 ((t, $S_B$, f)-TA strongly stabilizing protocol) A protocol $A$ is $(t, S_B, f)$-TA
strongly stabilizing if and only if starting from any arbitrary configuration, every execution
involving at most $f$ Byzantine processes contains a $(t, k, S_B, f)$-TA time contained configuration that is
reached after at most $l$ actions of each $S_B$-correct node. Moreover, $S_B$-legitimate configurations are
closed by actions of $A$. Parameters $l$ and $k$ are respectively the $(t, S_B, f)$-stabilization time and the
$(t, S_B, f)$-process-disruption time of $A$.

5 BFS Spanning Tree Construction

In this section, we are interested in the problem of BFS spanning tree construction. That is, the
system has a distinguished process called the root (and denoted by $r$) and we want to obtain a BFS
spanning tree rooted at this root. We make the following hypothesis: the root $r$ is never Byzantine.
To solve this problem, each process $v$ has two O-variables: the first is $prnt_v \in N_v \cup \{\bot\}$, which
is a pointer to the neighbor that is designated to be the parent of $v$ in the BFS tree, and the second
is $level_v \in \{0, \ldots, D\}$, which stores the depth (the number of hops from the root) of $v$ in this tree.
Obviously, Byzantine processes may disturb (at least) their neighbors. For example, a Byzantine
process may act as the root. This is why the specification of the BFS tree construction we adopt
states in fact that there exists a BFS spanning forest such that any root of this forest is either the
real root of the system or a Byzantine process. More formally, we use the following specification of
the problem.

Definition 17 (BFS path) A path $(v_0, \ldots, v_k)$ ($k \ge 1$) of $S$ is a BFS path if and only if:

1. $prnt_{v_0} = \bot$, $level_{v_0} = 0$, and $v_0 \in B \cup \{r\}$,

2. $\forall i \in \{1, \ldots, k\}$, $prnt_{v_i} = v_{i-1}$ and $level_{v_i} = level_{v_{i-1}} + 1$, and

3. $\forall i \in \{1, \ldots, k\}$, $level_{v_{i-1}} = \min_{u \in N_{v_i}} \{level_u\}$.

We define the specification predicate $spec(v)$ of the BFS spanning tree construction as follows.

\[ spec(v) : \begin{cases} prnt_v = \bot \text{ and } level_v = 0 & \text{if } v \text{ is the root } r \\ \text{there exists a BFS path } (v_0, \ldots, v_k) \text{ such that } v_k = v & \text{otherwise} \end{cases} \]
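To make the specification concrete, the following Python sketch checks it for one process by following parent pointers back towards a root of the forest and verifying the three BFS-path conditions at each hop. It is only an illustration under stated assumptions: the dictionaries `neighbors`, `prnt` and `level`, the use of `None` for the null parent, and the function name `spec` are hypothetical encodings chosen for this example.

```python
def spec(v, root, byzantine, neighbors, prnt, level):
    """Check spec(v): the real root is parentless at level 0; any other
    process must be the end of a BFS path starting at the root or at a
    Byzantine process."""
    if v == root:
        return prnt[v] is None and level[v] == 0
    cur, hops = v, 0
    # Levels strictly decrease along a valid chain, so the walk cannot cycle.
    while prnt[cur] is not None:
        p = prnt[cur]
        if p not in neighbors[cur]:
            return False
        if level[cur] != level[p] + 1:            # condition 2
            return False
        if level[p] != min(level[u] for u in neighbors[cur]):  # condition 3
            return False
        cur, hops = p, hops + 1
    # Condition 1: the path starts at a parentless level-0 process of B or r.
    return hops >= 1 and level[cur] == 0 and (cur == root or cur in byzantine)


# Line 0 - 1 - 2 - 3: process 0 is the real root, process 3 is Byzantine
# and pretends to be a root, so the result is a two-tree BFS forest.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
prnt = {0: None, 1: 0, 2: 3, 3: None}
level = {0: 0, 1: 1, 2: 1, 3: 0}
print([spec(v, 0, {3}, neighbors, prnt, level) for v in (0, 1, 2)])
```

In this example process 2 satisfies the specification although its parent is the Byzantine process, which is exactly the forest relaxation described above.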

If every process is correct, note that $spec$ implies the existence of a BFS spanning
tree rooted at the real root. The well-known min + 1 protocol solves this problem in a self-stabilizing
way (see [E]). In the remainder of this section, we assume that some processes may be Byzantine and
we study the Byzantine containment properties of this protocol. We show that this self-stabilizing
protocol moreover has optimal Byzantine containment properties.

In more detail, we first prove that there exists neither a strictly nor a strongly stabilizing solution
to the BFS spanning tree construction (see Theorems 1 and 2). Then, we demonstrate in Theorems
3 and 4 that the min + 1 protocol is both $(S_B, f)$-TA strictly and $(t, S_B^*, f)$-TA strongly stabilizing
where $f \le n - 1$, $t = 2m$, and

\[ S_B = \{ v \in V \mid \min_{b \in B} (d(v, b)) \le d(r, v) \} \]

\[ S_B^* = \{ v \in V \mid \min_{b \in B} (d(v, b)) < d(r, v) \} \]

Figure 1 provides an example of these containment areas. Finally, we show that these containment
areas are in fact optimal (see Theorems 5 and 6).
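The two containment areas can be computed from two breadth-first searches, one from the root and one from the set of Byzantine processes. The sketch below is an illustration only, and it rests on an assumed reading of the definitions: $S_B$ is taken as the set of processes at least as close to some Byzantine process as to the root (non-strict comparison), and $S_B^*$ as the set of processes strictly closer to a Byzantine process than to the root. The function names and the example graph are hypothetical.

```python
from collections import deque


def bfs_distances(sources, neighbors):
    """Hop distance from each process to its nearest process in sources."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in neighbors[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def containment_areas(neighbors, root, byzantine):
    """Return (S_B, S_B_star) under the assumed reading stated in the text:
    S_B uses <= and S_B_star uses < when comparing the distance to the
    nearest Byzantine process with the distance to the root."""
    d_root = bfs_distances([root], neighbors)
    d_byz = bfs_distances(list(byzantine), neighbors)
    inf = float("inf")
    s_b = {v for v in neighbors if d_byz.get(v, inf) <= d_root[v]}
    s_b_star = {v for v in neighbors if d_byz.get(v, inf) < d_root[v]}
    return s_b, s_b_star


# Line 0 - 1 - 2 - 3 - 4 with root 0 and Byzantine process 4.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
s_b, s_b_star = containment_areas(line, 0, {4})
print(s_b, s_b_star)
```

Process 2 is equidistant from the root and from the Byzantine process, so it belongs to the first area but not to the second; this tie is the only difference between the two sets.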




Figure 1: Example of containment areas for BFS spanning tree construction. 

5.1 Impossibility results 

Theorem 1 Even under the central daemon, there exists no $(c, 1)$-strictly stabilizing protocol for
BFS spanning tree construction where $c$ is any (finite) integer.

Proof This result is a direct application of Theorem 4 of [17] (note that the specification of BFS
tree construction is $D$-restrictive in the worst case, where $D$ is the diameter of the system). □

Theorem 2 Even under the central daemon, there exists no $(t, c, 1)$-strongly stabilizing protocol
for BFS spanning tree construction where $t$ and $c$ are any (finite) integers.

Proof Let $t$ and $c$ be (finite) integers. Assume that there exists a $(t, c, 1)$-strongly stabilizing
protocol $\mathcal{P}$ for BFS spanning tree construction under the central daemon. Let $S = (V, E)$ be the
following system: $V = \{p_0 = r, p_1, \ldots, p_{2c+2}, p_{2c+3} = b\}$ and $E = \{\{p_i, p_{i+1}\}, i \in \{0, \ldots, 2c + 2\}\}$.
Process $p_0$ is the real root and process $b$ is a Byzantine one.

Assume that the initial configuration $\rho_0$ of $S$ satisfies: $level_r = level_b = 0$, $prnt_r = prnt_b = \bot$
and other variables of $b$ (if any) are identical to those of $r$ (see Figure 2). Assume now that $b$ takes
exactly the same actions as $r$ (if any) immediately after $r$ (note that $d(r, b) > c$ and hence $level_r = 0$
and $prnt_r = \bot$ still hold by closure and then $level_b = 0$ and $prnt_b = \bot$ still hold too). Then, by
symmetry of the execution and by convergence of $\mathcal{P}$ to $spec$, we can deduce that the system reaches
in a finite time a configuration $\rho_1$ (see Figure 2) in which: $\forall i \in \{1, \ldots, c + 1\}$, $level_{p_i} = i$ and
$prnt_{p_i} = p_{i-1}$, and $\forall i \in \{c + 2, \ldots, 2c + 2\}$, $level_{p_i} = 2c + 3 - i$ and $prnt_{p_i} = p_{i+1}$ (because this
configuration is the only one in which every correct process $v$ such that $d(v, b) > c$ satisfies $spec(v)$
when $level_r = level_b = 0$ and $prnt_r = prnt_b = \bot$). Note that $\rho_1$ is 0-legitimate and 0-stable and a
fortiori $c$-legitimate and $c$-stable.

Figure 2: Configurations used in the proof of Theorem 2.

Assume now that the Byzantine process acts as a correct process and executes $\mathcal{P}$ correctly.
Then, by convergence of $\mathcal{P}$ in fault-free systems (remember that a $(t, c, 1)$-strongly stabilizing
protocol is a special case of self-stabilizing protocol), we can deduce that the system reaches in
a finite time a configuration $\rho_2$ (see Figure 2) in which: $\forall i \in \{1, \ldots, 2c + 3\}$, $level_{p_i} = i$ and
$prnt_{p_i} = p_{i-1}$ (because this configuration is the only one in which every process $v$ satisfies $spec(v)$).
Note that the portion of execution between $\rho_1$ and $\rho_2$ contains at least one $c$-disruption ($p_{c+2}$
is a $c$-correct process and modifies its O-variables at least once) and that $\rho_2$ is 0-legitimate and
0-stable and a fortiori $c$-legitimate and $c$-stable.

Assume now that the Byzantine process $b$ takes the following state: $level_b = 0$ and $prnt_b = \bot$.
This step brings the system into configuration $\rho_3$ (see Figure 2). From this configuration, we can
repeat the execution we constructed from $\rho_0$. By the same token, we obtain an execution of $\mathcal{P}$ which
contains $c$-legitimate and $c$-stable configurations (see $\rho_1$) and an infinite number of $c$-disruptions,
which contradicts the $(t, c, 1)$-strong stabilization of $\mathcal{P}$. □
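The two configurations used in this proof can be written down explicitly to see which c-correct process is forced to change its O-variables. The sketch below is illustrative only: processes are encoded by their index on the line, a state is a hypothetical pair (level, parent index), and the names `theorem2_configs` and `disturbed_c_correct` are not from the paper.

```python
def theorem2_configs(c):
    """Build the configurations rho_1 and rho_2 of the proof on the line
    p_0 = r, p_1, ..., p_{2c+3} = b. A state is (level, parent index)."""
    last = 2 * c + 3
    # rho_1: b pretends to be a root, so the line splits into two trees.
    rho1 = {0: (0, None), last: (0, None)}
    for i in range(1, c + 2):
        rho1[i] = (i, i - 1)
    for i in range(c + 2, 2 * c + 3):
        rho1[i] = (2 * c + 3 - i, i + 1)
    # rho_2: b behaves correctly, so a single tree is rooted at r.
    rho2 = {0: (0, None)}
    for i in range(1, last + 1):
        rho2[i] = (i, i - 1)
    return rho1, rho2


def disturbed_c_correct(c):
    """c-correct processes whose O-variables differ between rho_1 and rho_2."""
    rho1, rho2 = theorem2_configs(c)
    last = 2 * c + 3
    return [i for i in range(last)
            if last - i > c and rho1[i] != rho2[i]]


for c in range(4):
    print(c, disturbed_c_correct(c))
```

For each tested value of c the only c-correct process that differs is the one of index c + 2, which sits at distance c + 1 from b; repeating the switch between the two behaviors of b therefore disturbs it each time.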

5.2 Byzantine containment properties of the min + 1 protocol 

In the min + 1 protocol, as in many self-stabilizing tree construction protocols, each process $v$
checks locally the consistency of its $level_v$ variable with respect to those of its neighbors. When
it detects an inconsistency, it changes its $prnt_v$ variable in order to choose a "better" neighbor.
The notion of "better" neighbor is based on the global desired property of the tree (here, the BFS
requirement implies choosing a neighbor with the minimum level).

When the system may contain Byzantine processes, they may disturb their neighbors by
alternately providing "better" and "worse" states.

The min + 1 protocol chooses an arbitrary one of the "better" neighbors (that is, neighbors with
the minimal level). Actually this strategy allows us to achieve $(S_B, f)$-TA strict stabilization
but is not sufficient to achieve $(t, S_B^*, f)$-TA strong stabilization. To achieve $(t, S_B^*, f)$-TA
strong stabilization, we must bring a slight modification to the protocol: each process chooses a "better"
neighbor in round-robin order (among its neighbors with the minimal level).

Algorithm 5.1 presents our BFS spanning tree construction protocol $\mathcal{SSBFS}$ which is both
$(S_B, f)$-TA strictly and $(t, S_B^*, f)$-TA strongly stabilizing (where $f \le n - 1$ and $t = 2m$) provided
that the root is never Byzantine.

Algorithm 5.1 $\mathcal{SSBFS}$: A TA strictly and TA strongly stabilizing protocol for BFS tree construction

Data:

$N_v$: totally ordered set of neighbors of $v$

Variables:

$prnt_v \in N_v \cup \{\bot\}$: pointer on the parent of $v$ in the tree.
$level_v \in \mathbb{N}$: integer

Macro:

For any subset $A \subseteq N_v$, $choose(A)$ returns the first element of $A$ which is bigger than $prnt_v$
(in a round-robin fashion).

Rules:

$(R_r) :: (v = r) \wedge ((prnt_v \neq \bot) \vee (level_v \neq 0)) \longrightarrow prnt_v := \bot;\ level_v := 0$

$(R_v) :: (v \neq r) \wedge ((prnt_v = \bot) \vee (level_v \neq level_{prnt_v} + 1) \vee (level_{prnt_v} \neq \min_{q \in N_v} \{level_q\}))$
$\longrightarrow prnt_v := choose(\{p \in N_v \mid level_p = \min_{q \in N_v} \{level_q\}\});\ level_v := level_{prnt_v} + 1$
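The two guarded rules and the round-robin macro can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the dictionary encoding, the use of `None` for the null parent, the helper names `choose`, `enabled`, `fire` and `run_fair_central`, and the particular fair central schedule (sweeping over all processes in turn) are assumptions made for the example, and no Byzantine process is simulated.

```python
def choose(candidates, ordered_neighbors, current_parent):
    """First candidate strictly after current_parent in the cyclic order of
    ordered_neighbors (the current parent is tried last)."""
    n = len(ordered_neighbors)
    start = 0
    if current_parent in ordered_neighbors:
        start = ordered_neighbors.index(current_parent) + 1
    for offset in range(n):
        q = ordered_neighbors[(start + offset) % n]
        if q in candidates:
            return q
    raise ValueError("no candidate among the neighbors")


def enabled(v, root, neighbors, prnt, level):
    """Guard of rule (R_r) for the root, of rule (R_v) otherwise."""
    if v == root:
        return prnt[v] is not None or level[v] != 0
    m = min(level[q] for q in neighbors[v])
    p = prnt[v]
    return p is None or level[v] != level[p] + 1 or level[p] != m


def fire(v, root, neighbors, prnt, level):
    """Statement of the enabled rule; returns the new (prnt_v, level_v)."""
    if v == root:
        return None, 0
    m = min(level[q] for q in neighbors[v])
    candidates = {q for q in neighbors[v] if level[q] == m}
    return choose(candidates, neighbors[v], prnt[v]), m + 1


def run_fair_central(neighbors, root, prnt, level, max_rounds=1000):
    """Fair central daemon (assumed schedule): sweep over the processes and
    activate each enabled one in turn, until no process is enabled."""
    for _ in range(max_rounds):
        fired = False
        for v in neighbors:
            if enabled(v, root, neighbors, prnt, level):
                prnt[v], level[v] = fire(v, root, neighbors, prnt, level)
                fired = True
        if not fired:
            return
    raise RuntimeError("no convergence within max_rounds")


# Square 0 - 1 - 3 - 2 - 0 with root 0, started from an arbitrary configuration.
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
prnt = {0: 1, 1: 3, 2: None, 3: 2}
level = {0: 5, 1: 0, 2: 7, 3: 1}
run_fair_central(neighbors, 0, prnt, level)
print(prnt, level)
```

The round-robin choice matters only when several neighbors share the minimal level, as for process 3 here; in a fault-free run either of its two neighbors is a valid parent.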



In the remainder of this section, we provide proofs of topology-aware strict and strong stabilization
of $\mathcal{SSBFS}$. First of all, remember that the real root $r$ cannot be a Byzantine process by
hypothesis. Note that the subsystems whose sets of nodes are respectively $V \setminus S_B$ and $V \setminus S_B^*$ are
connected by construction.

$(S_B, n - 1)$-TA strict stabilization

Given a configuration $\rho \in C$ and an integer $d \in \{0, \ldots, D\}$, let us define the following predicate:

\[ I_d(\rho) \equiv \forall v \in V,\ level_v \ge \min\left\{ d, \min_{u \in B \cup \{r\}} \{d(v, u)\} \right\} \]

Lemma 1 For any integer $d \in \{0, \ldots, D\}$, the predicate $I_d$ is closed.

Proof Let $d$ be an integer such that $d \in \{0, \ldots, D\}$. Let $\rho \in C$ be a configuration such that
$I_d(\rho) = true$ and $\rho' \in C$ be a configuration such that $\rho \stackrel{R}{\mapsto} \rho'$ is a step of $\mathcal{SSBFS}$.

If the root process $r \in R$ (respectively a Byzantine process $b \in R$), then we have $level_r = 0$
(respectively $level_b \ge 0$) in $\rho'$ by construction of $(R_r)$ (respectively by definition of $level_b$). Hence,

\[ level_r \ge \min\left\{ d, \min_{u \in B \cup \{r\}} \{d(r, u)\} \right\} = 0 \quad \left(\text{respectively } level_b \ge \min\left\{ d, \min_{u \in B \cup \{r\}} \{d(b, u)\} \right\} = 0\right). \]

If a correct process $v \in R$ satisfies $v \neq r$, then there exists a neighbor $p$ of $v$ which satisfies the
following property in $\rho$ (since $v$ is activated and $I_d(\rho) = true$):

\[ level_p = \min_{q \in N_v} \{level_q\} \ge \min\left\{ d, \min_{u \in B \cup \{r\}} \{d(p, u)\} \right\} \]

Once $v$ is activated, we have: $level_v = level_p + 1$ in $\rho'$. Let $\delta = \min_{u \in B \cup \{r\}} \{d(v, u)\}$. Then, we have:
$\min_{u \in B \cup \{r\}} \{d(p, u)\} \ge \delta - 1$ (otherwise, we have a contradiction with the fact that $\delta = \min_{u \in B \cup \{r\}} \{d(v, u)\}$
and that $v$ and $p$ are neighbors). Consequently, $\rho'$ satisfies:

\begin{align*}
level_v = level_p + 1 &\ge \min\left\{ d, \min_{u \in B \cup \{r\}} \{d(p, u)\} \right\} + 1 \\
&\ge \min\{d, \delta - 1\} + 1 \\
&\ge \min\{d, \delta\} \\
&\ge \min\left\{ d, \min_{u \in B \cup \{r\}} \{d(v, u)\} \right\}
\end{align*}

We can deduce that $I_d(\rho') = true$, which concludes the proof. □
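The closure property of Lemma 1 can be exercised on a small example by simulating random steps of a distributed daemon with one Byzantine process and checking that the predicate, once true, stays true. The sketch below is an illustration under stated assumptions, not part of the proof: the graph, the random schedule, the range of values written by the Byzantine process, and all names are hypothetical, and correct processes pick the first minimal neighbor instead of using the round-robin macro (which does not affect levels).

```python
import random
from collections import deque


def nearest_distance(sources, neighbors):
    """Hop distance from each process to its nearest process in sources."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in neighbors[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def holds(d, level, near):
    """Predicate I_d: every level is at least min(d, distance to B or r)."""
    return all(level[v] >= min(d, near[v]) for v in level)


def random_step(level, prnt, neighbors, root, byz, rng):
    """One step: a random set of processes acts on the OLD configuration."""
    activated = [v for v in neighbors if rng.random() < 0.5]
    new_level, new_prnt = dict(level), dict(prnt)
    for v in activated:
        if v in byz:  # arbitrary behavior, with a non-negative level
            new_level[v] = rng.randint(0, 5)
            new_prnt[v] = rng.choice([None] + neighbors[v])
        elif v == root:  # rule (R_r); a no-op when the root is not enabled
            new_level[v], new_prnt[v] = 0, None
        else:  # rule (R_v), fired only when its guard holds
            m = min(level[q] for q in neighbors[v])
            p = prnt[v]
            if p is None or level[v] != level[p] + 1 or level[p] != m:
                new_prnt[v] = next(q for q in neighbors[v] if level[q] == m)
                new_level[v] = m + 1
    return new_level, new_prnt


# Line of six processes: root 0, Byzantine process 5, diameter D = 5.
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
root, byz, D = 0, {5}, 5
near = nearest_distance(byz | {root}, neighbors)
rng = random.Random(0)
level = {v: rng.randint(0, 5) for v in neighbors}
prnt = {v: None for v in neighbors}

violations = []
for _ in range(500):
    before = {d: holds(d, level, near) for d in range(D + 1)}
    level, prnt = random_step(level, prnt, neighbors, root, byz, rng)
    violations += [d for d in range(D + 1)
                   if before[d] and not holds(d, level, near)]
print("closure violations:", violations)
```

Such a run can only fail to find a counterexample; it does not replace the proof, but it is a quick sanity check of the inequality chain above.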

Let CC be the following set of configurations: 

CC = {/3 E C |(p is 5B-legitimate for spec) A {Id{p) = true) } 

Lemma 2 Any configuration of CC is {SB,n — 1)-TA contained for spec. 

Proof Let $\rho$ be a configuration of $CC$. By construction, $\rho$ is $S_B$-legitimate for $spec$.

In particular, the root process satisfies $prnt_r = \bot$ and $level_r = 0$. By construction of $\mathcal{SSBFS}$,
$r$ is not enabled and then never modifies its O-variables (since the guard of the rule of $r$ does not
involve the state of its neighbors).

In the same way, any process $v \in V \setminus (S_B \cup \{r\})$ satisfies $prnt_v \in N_v$, $level_v = level_{prnt_v} + 1$,
and $level_{prnt_v} = \min_{q \in N_v} \{level_q\}$. Note that, as $v \in V \setminus (S_B \cup \{r\})$ and $spec(v)$ holds in $\rho$, we have
$level_v = d(v,r)$. Hence, process $v$ is not enabled in $\rho$. It remains so as long as none of its neighbors $u$
modifies its $level_u$ variable to a value $\alpha$ such that $\alpha \leq level_v - 2$.

Assume that there exists an execution $e$ starting from $\rho$ in which a neighbor $u$ of a process
$v \in V \setminus (S_B \cup \{r\})$ modifies $level_u$ to satisfy $level_u \leq level_v - 2$ (without loss of generality, assume that
$u$ is the first process to modify $level_u$ in such a way in $e$). Note that $\min_{p \in B \cup \{r\}} \{d(u,p)\} \geq d(v,r) - 1$
(otherwise, we have a contradiction with the fact that $d(v,r) = \min_{p \in B \cup \{r\}} \{d(v,p)\}$ and that $v$ and $u$
are neighbors). Hence, we have:

$$\min_{p \in B \cup \{r\}} \{d(u,p)\} \geq d(v,r) - 1 > d(v,r) - 2 \geq level_u$$

This contradicts the closure of predicate $I_D$ established in Lemma 1.

Consequently, there exists no such execution and process $v$ remains infinitely disabled and then
never modifies its O-variables. This concludes the proof. □

Lemma 3 Starting from any configuration, any execution of $\mathcal{SSBFS}$ reaches a configuration of
$CC$ in a finite time.

Proof We are going to prove the following property by induction on $d \in \{0, \ldots, D\}$:

$(P_d)$: Starting from any configuration, any run of $\mathcal{SSBFS}$ reaches a configuration $\rho$ such that
$I_d(\rho) = true$ and in which any process $v \notin S_B$ such that $d(v,r) \leq d$ satisfies $spec(v)$.

Initialization: $d = 0$.

Let $\rho$ be an arbitrary configuration. Then, it is obvious that $I_0(\rho)$ is satisfied.

If a process $v \notin S_B$ satisfies $d(v,r) \leq 0$, then $v = r$. If $v$ does not satisfy $spec(v)$ in $\rho$, then $v$
is continuously enabled. Since the scheduling is fair, $v$ is activated in a finite time and then
$v$ satisfies $spec(v)$ in a finite time. Then, we proved that $(P_0)$ holds.

Induction: $d \geq 1$ and $P_{d-1}$ is true.

We know, by $P_{d-1}$, that any run of $\mathcal{SSBFS}$ under a distributed fair scheduler reaches a
configuration $\rho$ such that $I_{d-1}(\rho) = true$ and in which any process $v \notin S_B$ such that $d(v,r) \leq
d - 1$ satisfies $spec(v)$.

Let $E_d = \left\{v \in V \mid \min_{u \in B \cup \{r\}} \{d(v,u)\} \geq d\right\}$. Note that $I_{d-1}(\rho)$ implies that $\forall v \in E_d$, $level_v \geq
d - 1$ (since $\forall v \in E_d$, $\min\left\{d - 1, \min_{u \in B \cup \{r\}} \{d(v,u)\}\right\} = d - 1$ by construction).

Note that any process $v \in E_d$ such that $level_v = d - 1$ is enabled by $(R_v)$ since we have
$level_{prnt_v} \geq d - 1$ (by $I_{d-1}(\rho)$ and the fact that $prnt_v$ is a neighbor of $v$) and thus $level_v =
d - 1 < level_{prnt_v} + 1$. Moreover, this rule remains enabled until $v$ is activated by closure of
$I_{d-1}$ (see Lemma 1). As the scheduling is fair, we deduce that any process $v \in E_d$ such
that $level_v = d - 1$ is activated in any run starting from $\rho$ and $level_v \geq d$ holds. Then, we
can conclude that any run starting from $\rho$ reaches in a finite time a configuration $\rho'$ such that
$I_d(\rho') = true$.
Let $v \notin S_B$ be a process such that $d(r,v) = d$. We distinguish the following two cases:

Case 1: $spec(v)$ holds in $\rho'$ (and then $level_v = d$).

By closure of $I_d$, any configuration of any run starting from $\rho'$ satisfies $I_d$. Moreover,
$v$ satisfies $d(v,r) < \min_{u \in B} \{d(v,u)\}$. Hence, there exists a BFS path from $v$ to $r$. By
construction, process $v$ is then not enabled (recall that any neighbor $u$ of $v$ satisfies
$level_u \geq \min\left\{d, \min_{w \in B \cup \{r\}} \{d(u,w)\}\right\} \geq d - 1$). In conclusion, $v$ always satisfies $spec(v)$ in
any run starting from $\rho'$.

Case 2: $spec(v)$ does not hold in $\rho'$.

By construction of $\rho'$, we can split $N_v$ into two sets $S$ and $\bar{S}$ such that any process $u$ of
$S$ satisfies $level_u = d(r,u) = d - 1$ and $spec(u)$ (and thus there exists a BFS path from
$u$ to $r$) and any process $u$ of $\bar{S}$ satisfies $level_u \geq d$ (recall that $I_d(\rho') = true$ and then
$level_u \geq \min\left\{d, \min_{p \in B \cup \{r\}} \{d(u,p)\}\right\} \geq d$).

As $spec(v)$ does not hold in $\rho'$, we can deduce that $v$ is enabled in $\rho'$. As $I_d$ is closed
(by Lemma 1), we can deduce that $v$ remains enabled. Since the scheduling is fair, we
conclude that $v$ is activated in a finite time in any run starting from $\rho'$ and then $prnt_v$
is a process of $S$, which implies that $v$ satisfies $spec(v)$ in a finite time in any run starting
from $\rho'$.

In conclusion, $P_d$ is true, which ends the induction.

Then, it is easy to see that $P_D$ (where $D$ is the diameter of the system) implies the result. □

Theorem 3 $\mathcal{SSBFS}$ is a $(S_B, n-1)$-TA strictly stabilizing protocol for $spec$.

Proof This result is a direct consequence of Lemmas 2 and 3. □
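The statement of Theorem 3 can be observed on small instances. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses the min + 1 rules as they appear in the proofs above, a round-robin order as one particular fair schedule, a simplified choose (first minimum-level neighbor), a Byzantine process that rewrites its state at random in every round, and the reading of $V \setminus S_B$ as the processes strictly closer to $r$ than to any Byzantine process. The function names and both test graphs are hypothetical.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` in a connected undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def run_min_plus_one(adj, root, byzantine, rounds, seed):
    """Run `rounds` round-robin rounds from a random initial configuration."""
    rng = random.Random(seed)
    n = len(adj)
    level = {v: rng.randint(0, 2 * n) for v in adj}
    prnt = {v: rng.choice(adj[v] + [None]) for v in adj}
    for _ in range(rounds):
        for v in sorted(adj):
            if v in byzantine:                  # arbitrary behavior
                level[v] = rng.randint(0, 2 * n)
                prnt[v] = rng.choice(adj[v] + [None])
            elif v == root:
                level[v], prnt[v] = 0, None
            else:
                low = min(level[q] for q in adj[v])
                p = prnt[v]
                if p is None or level[p] != low or level[v] != low + 1:
                    prnt[v] = next(q for q in adj[v] if level[q] == low)
                    level[v] = low + 1
    return level, prnt

def safe_processes(adj, root, byzantine):
    """Processes strictly closer to the root than to any Byzantine process."""
    d_root = bfs_distances(adj, root)
    d_byz = [bfs_distances(adj, b) for b in byzantine]
    return {v: d_root[v] for v in adj if all(d_root[v] < t[v] for t in d_byz)}

def check(adj, root, byzantine, seed):
    """Safe processes hold level d(v, r) after 20 rounds and are unchanged after 40."""
    safe = safe_processes(adj, root, byzantine)
    level_a, prnt_a = run_min_plus_one(adj, root, byzantine, 20, seed)
    level_b, prnt_b = run_min_plus_one(adj, root, byzantine, 40, seed)
    return all(level_a[v] == dist
               and (level_a[v], prnt_a[v]) == (level_b[v], prnt_b[v])
               for v, dist in safe.items())

RING = {"r": ["u", "u2"], "u": ["r", "v"], "u2": ["r", "v2"],
        "v": ["u", "b"], "v2": ["u2", "b"], "b": ["v", "v2"]}
PATH = {"r": ["a"], "a": ["r", "c"], "c": ["a", "e"], "e": ["c", "z"], "z": ["e"]}
```

Because both runs in `check` use the same seed, the 40-round run extends the 20-round one, so comparing them shows that the processes outside the containment area no longer change their output once they are correct.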

$(2m, S_B^*, n-1)$-TA strong stabilization

Let $E_B = S_B \setminus S_B^*$ (i.e. $E_B$ is the set of processes $v$ such that $d(r,v) = \min_{b \in B} \{d(v,b)\}$).
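To make the three sets concrete, here is a minimal sketch that computes them by breadth-first search. It assumes that $S_B$ (respectively $S_B^*$) collects the processes whose distance to the nearest Byzantine process is at most (respectively strictly less than) their distance to $r$, which is consistent with the characterization of $E_B$ above. Counting the Byzantine processes themselves in these sets is a convention of this sketch, and the names used are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` in a connected undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def containment_areas(adj, root, byzantine):
    """Return (S_B, S_B*, E_B); `byzantine` must be non-empty."""
    d_root = bfs_distances(adj, root)
    d_byz = [bfs_distances(adj, b) for b in byzantine]
    nearest = {v: min(t[v] for t in d_byz) for v in adj}
    s_b = {v for v in adj if nearest[v] <= d_root[v]}
    s_b_star = {v for v in adj if nearest[v] < d_root[v]}
    return s_b, s_b_star, s_b - s_b_star

# Hypothetical path r - a - z with z Byzantine: a is equidistant from r and z.
adj = {"r": ["a"], "a": ["r", "z"], "z": ["a"]}
s_b, s_b_star, e_b = containment_areas(adj, "r", {"z"})
```

On this path, process `a` is at distance 1 from both `r` and `z`, so it is the only member of $E_B$: it may be disturbed by `z`, but only a bounded number of times according to the lemmas that follow.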

Lemma 4 If $\rho$ is a configuration of $CC$, then any process $v \in E_B$ is activated at most $\Delta_v$ times in
any execution starting from $\rho$.

Proof Let $\rho$ be a configuration of $CC$ and $v$ a process of $E_B$. By construction, there exists a
neighbor $u$ of $v$ such that $u \in V \setminus S_B$. Then, we know that $spec(u)$ holds in $\rho$. By Lemma 2, we
are ensured that $spec(u)$ remains true in any configuration of any execution starting from $\rho$. In
particular, $level_u = d(r,u)$. By closure of $I_D$, we know that $level_p \geq d(r,u)$ for any neighbor $p$
of $v$. Consequently, $level_u = \min_{q \in N_v} \{level_q\}$. This implies that, if $prnt_v = u$ and $level_v = level_u + 1$
in a configuration $\rho'$, then $spec(v)$ is satisfied and $v$ takes no actions in any execution starting from
$\rho'$.

Then, the construction of the macro choose implies that $u$ is chosen as $v$'s parent in at most
$\Delta_v$ actions of $v$. This implies the result. □

Lemma 5 If $\rho$ is a configuration of $CC$ and $v$ is a process such that $v \in E_B$, then for any execution
$e$ starting from $\rho$ either

1. there exists a configuration $\rho'$ of $e$ such that $spec(v)$ is always satisfied after $\rho'$, or

2. $v$ is activated in $e$.

Proof Let $\rho$ be a configuration of $CC$ and $v$ be a process such that $v \in E_B$. By contradiction,
assume that there exists an execution $e$ starting from $\rho$ such that (i) $spec(v)$ is infinitely often false
in $e$ and (ii) $v$ is never activated in $e$.

For any configuration $\rho$, let us denote by $P_v(\rho) = (v_0 = v, v_1 = prnt_v, v_2 = prnt_{v_1}, \ldots, v_k =
prnt_{v_{k-1}}, p_v = prnt_{v_k})$ the maximal sequence of processes following pointers $prnt$ (maximal means
here that either $prnt_{p_v} = \bot$ or $p_v$ is the first process such that $p_v = v_i$ for some $i \in \{0, \ldots, k\}$).

Let us study the following cases:

Case 1: $prnt_v \in V \setminus S_B$ in $\rho$.

Since $\rho \in CC$, $prnt_v$ satisfies $spec(prnt_v)$ in $\rho$ and in any execution starting from $\rho$ (by
Lemma 2). If $v$ does not satisfy $spec(v)$ in $\rho$, then we have $level_v \neq level_{prnt_v} + 1$ in $\rho$.
Then, $v$ is continuously enabled in $e$ and we have a contradiction between assumption (ii)
and the fairness of the scheduling. This implies that $v$ satisfies $spec(v)$ in $\rho$. The closure
of $I_D$ (established in Lemma 1) ensures that $v$ is never enabled in any execution starting
from $\rho$. Hence, $spec(v)$ remains true in any execution starting from $\rho$. This contradicts
assumption (i) on $e$.

Case 2: $prnt_v \notin V \setminus S_B$ in $\rho$.

By assumption (i) on $e$, we can deduce that there exist infinitely many configurations
$\rho'$ such that a process of $P_v(\rho')$ is enabled. By construction, the length of $P_v(\rho')$ is finite
for any configuration $\rho'$ and there exists only a finite number of processes in the system.
Consequently, there exists at least one process which is infinitely often enabled in $e$. Since
the scheduler is fair, we can conclude that there exists at least one process which is infinitely
often activated in $e$.

Let $A_e$ be the set of processes which are infinitely often activated in $e$. Note that $v \notin A_e$
by assumption (ii) on $e$. Let $e' = \rho' \ldots$ be the suffix of $e$ which contains only activations of
processes of $A_e$. Let $p$ be the first process of $P_v(\rho')$ which belongs to $A_e$ ($p$ exists since at
least one process of $P_v$ is enabled when $spec(v)$ is false). By construction, the prefix of $P_v(\rho'')$
from $v$ to $p$ in any configuration $\rho''$ of $e'$ remains the same as the one of $P_v(\rho')$. Let $p'$ be the
process such that $prnt_{p'} = p$ in $e'$ ($p'$ exists since $v \neq p$ implies that the prefix of $P_v(\rho')$ from
$v$ to $p$ counts at least two processes). As $p$ is infinitely often activated and as any activation
of $p$ modifies the value of $level_p$ (it takes at least two different values in $e'$), we can deduce
that $p'$ is infinitely often enabled in $e'$ (since the value of $level_{p'}$ is constant by construction
of $e'$ and $p$). Since the scheduler is fair, $p'$ is activated in a finite time in $e'$, which contradicts
the construction of $p$.

In the two cases, we obtain a contradiction with the construction of $e$, which proves the result. □

Let $CC^*$ be the following set of configurations:

$$CC^* = \{\rho \in C \mid (\rho \text{ is } S_B^*\text{-legitimate for } spec) \wedge (I_D(\rho) = true)\}$$

Note that, as $S_B^* \subseteq S_B$, we can deduce that $CC^* \subseteq CC$. Hence, properties of Lemmas 4 and 5
also apply to configurations of $CC^*$.

Lemma 6 Any configuration of $CC^*$ is $(2m, \Delta, S_B^*, n-1)$-TA time contained for $spec$.

Proof Let $\rho$ be a configuration of $CC^*$. As $S_B^* \subseteq S_B$, we know by Lemma 2 that any process $v$ of
$V \setminus S_B$ satisfies $spec(v)$ and takes no action in any execution starting from $\rho$.

Let $v$ be a process of $E_B$. By Lemmas 4 and 5, we know that $v$ takes at most $\Delta_v$ actions
in any execution starting from $\rho$. Moreover, we know that $v$ satisfies $spec(v)$ after its last action
(otherwise, we obtain a contradiction between the two lemmas). Hence, any process of $E_B$ takes at
most $\Delta_v \leq \Delta$ actions and then, there are at most $\sum_{v \in V} \Delta_v = 2m$ $S_B^*$-TA-disruptions in any execution
starting from $\rho$.

By definition of a TA time contained configuration, we obtain the result. □

Lemma 7 Starting from any configuration, any execution of $\mathcal{SSBFS}$ reaches a configuration of
$CC^*$ in a finite time under a distributed fair scheduler.

Proof Let $\rho$ be an arbitrary configuration. We know by Lemma 3 that any execution starting from
$\rho$ reaches in a finite time a configuration $\rho'$ of $CC$.

Let $v$ be a process of $E_B$. By Lemmas 4 and 5, we know that $v$ takes at most $\Delta_v$ actions in
any execution starting from $\rho'$. Moreover, we know that $v$ satisfies $spec(v)$ after its last action
(otherwise, we obtain a contradiction between the two lemmas). This implies that any execution
starting from $\rho'$ reaches a configuration $\rho''$ such that any process $v$ of $E_B$ satisfies $spec(v)$. It is
easy to see that $\rho'' \in CC^*$, which ends the proof. □

Theorem 4 $\mathcal{SSBFS}$ is a $(2m, S_B^*, n-1)$-TA strongly stabilizing protocol for $spec$.

Proof This result is a direct consequence of Lemmas 6 and 7. □

5.3 Optimality of containment areas of the min + 1 protocol 

Theorem 5 Even under the central daemon, there exists no $(A_B, 1)$-TA strictly stabilizing protocol
for BFS spanning tree construction where $A_B \subsetneq S_B$.

Proof This is a direct application of Theorem 2 of [9]. □

Theorem 6 Even under the central daemon, there exists no $(t, A_B, 1)$-TA strongly stabilizing protocol
for BFS spanning tree construction where $A_B \subsetneq S_B^*$ and $t$ is any (finite) integer.

Proof Let $\mathcal{P}$ be a $(t, A_B, 1)$-TA strongly stabilizing protocol for BFS spanning tree construction
where $A_B \subsetneq S_B^*$ and $t$ is a finite integer. We must distinguish the following cases:

Consider the following system: $V = \{r, u, u', v, v', b\}$ and $E = \{\{r,u\}, \{r,u'\}, \{u,v\}, \{u',v'\},
\{v,b\}, \{v',b\}\}$ ($b$ is a Byzantine process). We can see that $S_B^* = \{v, v'\}$. Since $A_B \subsetneq S_B^*$, we
have $v \notin A_B$ or $v' \notin A_B$. Consider now the following configuration $\rho_0$: $prnt_r = prnt_b = \bot$,
$level_r = level_b = 0$, $prnt$ and $level$ variables of other processes are arbitrary (see Figure 3; other
variables may have arbitrary values but other variables of $b$ are identical to those of $r$).

Assume now that $b$ takes exactly the same actions as $r$ (if any) immediately after $r$. Then, by
symmetry of the execution and by convergence of $\mathcal{P}$ to $spec$, we can deduce that the system reaches
in a finite time a configuration $\rho_1$ (see Figure 3) in which: $prnt_r = prnt_b = \bot$, $prnt_u = prnt_{u'} = r$,
$prnt_v = prnt_{v'} = b$, $level_r = level_b = 0$ and $level_u = level_{u'} = level_v = level_{v'} = 1$ (because this
configuration is the only one in which every correct process $v$ satisfies $spec(v)$ when $prnt_r = prnt_b = \bot$
and $level_r = level_b = 0$). Note that $\rho_1$ is $A_B$-legitimate for $spec$ and $A_B$-stable (whatever $A_B$ is).

Assume now that $b$ behaves as a correct process with respect to $\mathcal{P}$. Then, by convergence of $\mathcal{P}$
in a fault-free system starting from $\rho_1$, which is not legitimate (remember that a strongly stabilizing
protocol is a special case of self-stabilizing protocol), we can deduce that the system reaches in a
finite time a configuration $\rho_2$ (see Figure 3) in which: $prnt_r = \bot$, $prnt_u = prnt_{u'} = r$, $prnt_v = u$,
$prnt_{v'} = u'$, $prnt_b = v$ (or $prnt_b = v'$), $level_r = 0$, $level_u = level_{u'} = 1$, $level_v = level_{v'} = 2$ and
$level_b = 3$. Note that processes $v$ and $v'$ modify their O-variables in the portion of execution between
$\rho_1$ and $\rho_2$ and that $\rho_2$ is $A_B$-legitimate for $spec$ and $A_B$-stable (whatever $A_B$ is). Consequently,
this portion of execution contains at least one $A_B$-TA-disruption (whatever $A_B$ is).

Figure 3: Configurations $\rho_0$, $\rho_1$, $\rho_2$ and $\rho_3$ used in proof of Theorem 6.

Assume now that the Byzantine process $b$ takes the following state: $prnt_b = \bot$ and $level_b = 0$.
This step brings the system into configuration $\rho_3$ (see Figure 3). From this configuration, we
can repeat the execution we constructed from $\rho_0$. By the same token, we obtain an execution
of $\mathcal{P}$ which contains $A_B$-legitimate and $A_B$-stable configurations (see $\rho_1$) and an infinite number of
$A_B$-TA-disruptions (whatever $A_B$ is), which contradicts the $(t, A_B, 1)$-TA strong stabilization of $\mathcal{P}$.
□
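The cycle $\rho_0, \rho_1, \rho_2, \rho_3$ can be replayed concretely by instantiating $\mathcal{P}$ with the min + 1 protocol. This is only an illustration of the argument, which itself holds for any protocol $\mathcal{P}$. The sketch assumes the min + 1 rules as used in Section 5.2, a central daemon that activates one enabled process at a time, and a simplified choose (first minimum-level neighbor); the names `RING` and `stabilize`, the labels `u2` and `v2` (standing for $u'$ and $v'$) and the arbitrary initial level 5 are hypothetical.

```python
RING = {
    "r": ["u", "u2"], "u": ["r", "v"], "u2": ["r", "v2"],
    "v": ["u", "b"], "v2": ["u2", "b"], "b": ["v", "v2"],
}

def stabilize(adj, level, prnt, root, frozen):
    """Activate enabled processes one at a time until none is enabled.

    Processes in `frozen` keep their state (b imitating the root, which never moves).
    """
    level, prnt = dict(level), dict(prnt)
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v in frozen:
                continue
            if v == root:
                if level[v] != 0 or prnt[v] is not None:
                    level[v], prnt[v] = 0, None
                    changed = True
                continue
            low = min(level[q] for q in adj[v])
            p = prnt[v]
            if p is None or level[p] != low or level[v] != low + 1:
                prnt[v] = next(q for q in adj[v] if level[q] == low)
                level[v] = low + 1
                changed = True
    return level, prnt

# rho_0: r and b both look like roots; the other variables are arbitrary.
level0 = {x: 5 for x in RING}
level0["r"] = level0["b"] = 0
prnt0 = {x: None for x in RING}

level1, prnt1 = stabilize(RING, level0, prnt0, "r", frozen={"b"})   # rho_1
level2, prnt2 = stabilize(RING, level1, prnt1, "r", frozen=set())   # rho_2, b acts correctly
level3, prnt3 = dict(level2), dict(prnt2)
level3["b"], prnt3["b"] = 0, None                                   # rho_3, b resets itself
level4, prnt4 = stabilize(RING, level3, prnt3, "r", frozen={"b"})   # same state as rho_1
```

Each pass through this cycle forces $v$ and $v'$ to change their parent and level, so a Byzantine process that repeats it indefinitely causes an unbounded number of disruptions at distance 1 from itself.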

6 Conclusion 

In this article, we studied BFS spanning tree construction in the presence of both systemic
transient faults and permanent Byzantine failures. As this task is global, it is impossible to solve
it in a strictly stabilizing way. We then proved that there exists no solution to this problem even
if we consider the weaker notion of strong stabilization.

Then, we provided a study of the Byzantine containment properties of the well-known min + 1
protocol. This protocol is one of the simplest self-stabilizing protocols which solve this problem.
Nevertheless, we proved that it achieves optimal area containment with respect to the notions of topology-
aware strict and strong stabilization. All our results are summarized in the table below.



| Stabilization property | BFS spanning tree construction |
|---|---|
| $(c, f)$-strict stabilization (for any $c$ and $f$) | Impossible (Theorem 1) |
| $(t, c, f)$-strong stabilization (for any $t$, $c$, $f$) | Impossible (Theorem 2) |
| $(A_B, f)$-TA strict stabilization (for any $f$ and $A_B \subsetneq S_B$) | Impossible (Theorem 5) |
| $(S_B, f)$-TA strict stabilization (for $0 \leq f \leq n-1$) | Possible (Theorem 3) |
| $(t, A_B, f)$-TA strong stabilization (for any $f$, $t$ and $A_B \subsetneq S_B^*$) | Impossible (Theorem 6) |
| $(t, S_B^*, f)$-TA strong stabilization (for $0 \leq f \leq n-1$) | Possible (Theorem 4, $t = 2m$) |

Using the result of [10] about r-operators, we can easily extend the results of this paper to some
other problems such as depth-first search or reliability spanning trees. This work raises the following open
questions. Does any other global static task, such as leader election or maximal matching, admit a topology-aware
strictly and/or strongly stabilizing solution? We can also wonder about non-static tasks such as mutual
exclusion (recall that local mutual exclusion has a strictly stabilizing solution provided by [17]).